Reinforcement Learning for Robot Obstacle Avoidance and Wall Following

Author

  • Chi Zhang
Abstract

Reinforcement learning is the problem of learning how to map situations in the environment to actions so as to maximize a reward signal. In this project, the SARSA(λ) algorithm is applied to teach a robot to follow a wall and avoid running into obstacles. The robot is equipped with a laser scanner and odometry to perform the learning. It senses the state of the environment from the laser readings and chooses an action according to the learned policy. A reward is then given to the robot according to the new state, and the policy is updated. The SARSA(λ) method successfully taught the robot the task after around 10 minutes of training. Firstly, the resolution of the states affected learning speed: with too many states, the robot took a long time to learn. The action set affected the quality of learning, because more available actions gave the robot more degrees of freedom to move in a sophisticated way. Secondly, the reward function influenced the Q table in a complicated way. If the reward definition was too complex, some gains could conflict and it became difficult to obtain a stable Q table. A concise reward function led to a shorter learning time. Thirdly, the trade-off between exploration and exploitation was investigated using different greedy thresholds. Decaying the greedy value was necessary for the robot to explore at an early stage and to drive on the correct path later. To verify the effectiveness of my design, I tested the robot on another map and found that it took longer to learn the more complex map but could still learn the skills successfully. I also explored the impact of the level of momentum, which corresponds to a different level of the Markov decision process, by tuning the value of λ.
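The update loop described in the abstract (sense the state, pick an action from the policy, receive a reward, update the Q table) can be sketched as tabular SARSA(λ) with accumulating eligibility traces and a decaying greedy threshold. This is a minimal illustration under stated assumptions, not the project's implementation: the `env_step` interface, the toy `corridor_step` and `chain_step` environments, and all hyperparameter values are hypothetical stand-ins for the laser-based states, robot actions, and reward function that the abstract does not specify.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_lambda_episode(env_step, start_state, actions, q, *, alpha=0.1,
                         gamma=0.9, lam=0.8, epsilon=0.1, max_steps=200,
                         rng=None):
    """Run one episode of tabular SARSA(lambda) and update q in place.

    env_step(state, action) -> (next_state, reward, done) is an assumed
    interface, not the one used in the original project.
    """
    rng = rng or random.Random(0)
    traces = defaultdict(float)  # eligibility trace per (state, action)
    state = start_state
    action = epsilon_greedy(q, state, actions, epsilon, rng)
    for _ in range(max_steps):
        next_state, reward, done = env_step(state, action)
        next_action = epsilon_greedy(q, next_state, actions, epsilon, rng)
        target = reward if done else reward + gamma * q[(next_state, next_action)]
        delta = target - q[(state, action)]
        traces[(state, action)] += 1.0  # accumulating trace
        for key in list(traces):
            q[key] += alpha * delta * traces[key]
            traces[key] *= gamma * lam  # older pairs get less credit
        if done:
            break
        state, action = next_state, next_action
    return q


def corridor_step(state, action):
    """Hypothetical toy world: cells 0..5, cell 0 is a crash, cell 5 a goal."""
    next_state = state + action
    if next_state <= 0:
        return 0, -1.0, True
    if next_state >= 5:
        return 5, 1.0, True
    return next_state, 0.0, False


def chain_step(state, action):
    """Hypothetical two-step chain: 0 -> 1 (reward 0) -> terminal (reward 1)."""
    if state == 0:
        return 1, 0.0, False
    return 2, 1.0, True


def train(episodes=300, seed=0):
    """Train on the corridor while decaying epsilon from 0.5 towards 0.01."""
    rng = random.Random(seed)
    q = defaultdict(float)
    epsilon = 0.5
    for _ in range(episodes):
        sarsa_lambda_episode(corridor_step, 2, [-1, 1], q,
                             epsilon=epsilon, rng=rng)
        epsilon = max(0.01, epsilon * 0.98)  # explore early, exploit later
    return q


if __name__ == "__main__":
    q = train()
    for s in range(1, 5):
        print(s, {a: round(q[(s, a)], 3) for a in (-1, 1)})
```

On the two-step chain, running a single episode with `lam=0.8` gives the first state a share of the final reward through its trace, while `lam=0` leaves its value untouched after that episode. That is one way to see how tuning λ changes how far back credit is assigned.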


Similar resources

Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)

In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...


Time Variable Reinforcement Learning and Reinforcement Function Design

We introduce the mathematical model for time variable reinforcement learning. The policy, the rewards or reinforcement function, and the transition probabilities may depend on the progress of the time t. We prove that under certain conditions slightly changed methods of classical dynamic programming assure finding the optimal policy. For that we deduce the Bellman equation for the time variable ...


A reactive navigation method based on an incremental learning of tasks sequences

Within the context of learning sequences of basic tasks to build a complex behavior, a method is proposed to coordinate a hierarchical set of tasks. Each one possesses a set of sub-tasks lower in the hierarchy which must be coordinated to respect a binary perceptive constraint. For each task, the coordination is achieved by a reinforcement-learning-inspired algorithm based on a heuristic which d...


Self-learning navigation algorithm for vision-based mobile robots using machine learning algorithms

Many mobile robot navigation methods use, among others, laser scanners, ultrasonic sensors, vision cameras for detecting obstacles and following paths. However, humans use only visual (e.g. eye) information for navigation. In this paper, we propose a mobile robot control method based on machine learning algorithms which use only camera vision. To efficiently define the state of the robot from r...


Improving Reinforcement Learning of an Obstacle Avoidance Behavior with Forbidden Sequences of Actions

This paper is concerned with the improvement of reinforcement learning through the use of forbidden sequences of actions. A given reinforcement function can generate multiple effective behaviors. Each behavior is effective only considering the cumulative reward over time. It may not be the behavior expected by the designer. In this case, the usual solution is to modify the reinforcement funct...


Q-learning for Robots

Robot learning is a challenging, and somewhat unique, research domain. If a robot behavior is defined as a mapping between situations that occur in the real world and actions to be accomplished, then the supervised learning of a robot behavior requires a set of representative examples (situation, desired action). In order to be able to gather such a learning base, the human operator must hav...




Publication date: 2012